Abstract. - We provide numerical indications of the $q$-generalised central limit theorem
that has been conjectured (Tsallis 2004) in nonextensive statistical mechanics. We focus on $N$
binary random variables correlated in a scale-invariant way. The correlations are introduced by
imposing the Leibnitz rule on a probability set based on the so-called $q$-product with $q \le 1$. We
show that, in the large $N$ limit (and after appropriate centering, rescaling, and symmetrisation),
the emerging distributions are $q_e$-Gaussians, i.e., $p(x) \propto [1 - (1 - q_e)\,\beta(N)\,x^2]^{1/(1-q_e)}$, with
$q_e = 2 - \frac{1}{q}$, and with coefficients $\beta(N)$ approaching finite values $\beta(\infty)$. The particular case
$q = q_e = 1$ recovers the celebrated de Moivre-Laplace theorem.



Introduction. - The central limit theorem (CLT) is a cornerstone of probability theory 
and is of fundamental importance in statistical mechanics. This important theorem implies, 
roughly speaking, that any sum of $N$ independent random variables will tend, as $N \to \infty$,
to be distributed according to a certain law (which behaves as an attractor in the space 
of distributions). When the distribution of the individual random variables has any finite 
variance, the attractor for the sum will be a normal (Gaussian) distribution [1], and this is 
the result usually known as CLT (from now on denoted G-CLT). Several extensions of the 
CLT exist, such as the one due to Gnedenko, Kolmogorov, and Lévy [1] (from now on denoted
L-CLT), widely known in physics because of its relation with anomalous diffusion [2]. This
extension states that the sum of independent infinite-variance variables will be attracted to
Lévy distributions. The G-CLT explains the frequent occurrence of normal distributions in
nature. Its first manifestation in mathematics was due to Abraham de Moivre in 1733, followed 
independently by Pierre-Simon de Laplace in 1774. The distribution was rediscovered by
Robert Adrain in 1808, and then finally by Carl Friedrich Gauss, who based his famous
theory of errors on it [3]. A central result is the fact that the binomial distribution approaches,
for $N \to \infty$ and after being appropriately centered and rescaled, a Gaussian. This can be
considered as the first historical manifestation of the G-CLT. It is frequently referred to as the 
de Moivre-Laplace theorem. It is this relation that we aim to generalise here by allowing for 
the presence of scale-invariant global correlations (previous attempts along similar lines are 
reviewed in [4]). We thus suggest an explanation of the frequent occurrence of $q$-Gaussians in
natural and artificial systems [5,6]. The basic statistical-mechanical program is still essentially
Boltzmann's program, in fact only partially fulfilled until now, despite a common belief to
the contrary. It consists of (i) constructing, from microscopic dynamics, the probabilities 
of occupancy of phase space for a given (typically large) time t and a given (typically large) 
number of elements N, and (ii) deriving, from these probabilities, the attractor in distribution 
space, the entropy, and all other thermodynamical quantities. The present paper addresses a 
relevant aspect of the second step only, namely the $N \to \infty$ limit for fixed (typically large) $t$.

The $q$-Gaussians are distributions that naturally emerge within the framework of "nonextensive
statistical mechanics" [6]. They are defined by $p(x) \propto e_q^{-\beta x^2} \equiv [1 - (1 - q)\beta x^2]^{1/(1-q)}$,
where $\beta$ is a positive constant characterising the width. They optimise (1) the entropy
$S_q = \frac{1 - \int dx\,[p(x)]^q}{q - 1}$ (with $S_1 = S_{BG} = -\int dx\, p(x) \ln p(x)$, where BG stands for
Boltzmann-Gibbs) under simple constraints [7]. The $q$-Gaussian has a compact support for $q < 1$,
recovers the Gaussian distribution for $q = 1$, and decays asymptotically as a power law for
$1 < q < 3$; $p(x)$ is not normalizable for $q \ge 3$. Its variance $\int dx\, x^2 p(x)$ is finite for $q < 5/3$,
and diverges for $5/3 \le q < 3$. Its $q$-variance $\int dx\, x^2 [p(x)]^q / \int dx\, [p(x)]^q$ remains finite
for $q < 3$. It recovers the Student $t$-distribution with $l$ degrees of freedom if $q = (3 + l)/(1 + l)$.
For $l = 1$, hence $q = 2$, we get the Cauchy-Lorentz distribution.
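The limiting cases listed above can be checked numerically. The following is only a minimal sketch: the function name `q_gaussian`, the default `beta = 1`, and the cut-off convention (returning zero outside the compact support for $q < 1$) are illustrative assumptions, not part of the original text.

```python
import math

def q_gaussian(x, q, beta=1.0):
    """Unnormalised q-Gaussian [1 - (1-q) beta x^2]^(1/(1-q)).

    Assumed conventions: q = 1 is treated as the ordinary Gaussian limit,
    and the function is set to zero where the bracket is not positive
    (this can only happen for q < 1, i.e. outside the compact support).
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-beta * x * x)
    base = 1.0 - (1.0 - q) * beta * x * x
    if base <= 0.0:
        return 0.0
    return base ** (1.0 / (1.0 - q))

# q = 2 gives the Cauchy-Lorentz shape 1 / (1 + beta x^2).
cauchy_value = q_gaussian(1.5, 2.0)

# q = (3 + l)/(1 + l) with l = 3 gives the Student t shape (1 + x^2/l)^(-(l+1)/2),
# provided beta is chosen so that (q - 1) * beta = 1/l.
l = 3
q_t = (3 + l) / (1 + l)
student_value = q_gaussian(0.8, q_t, beta=1.0 / (l * (q_t - 1.0)))

print(cauchy_value, student_value)
```

The choice of `beta` in the Student $t$ comparison follows from matching $(q-1)\beta = 1/l$; it is a normalisation convenience rather than something fixed by the text.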

The frequent occurrence of these $q$-distributions can be easily understood if some new CLT
(from now on denoted $q$-CLT) exists. The already known theorems do not explain this
quasi-ubiquity. Indeed, the convolution of $N$ independent such distributions leads, for $N \to \infty$, to
Gaussians if $q < 5/3$, and to Lévy distributions if $5/3 < q < 3$. Therefore, these $N$ variables
must be strongly correlated for the $q$-distributions to be stable under convolution, i.e., to
constitute attractors in the space of distributions. In other words, a new theorem would be
very welcome. Such a possibility was already discussed in [8] and recently conjectured in
detail [9]. In the light of the arguments in [10], it seems natural to think that strictly or
asymptotically scale-invariant correlations will yield a suitable $q$-CLT. We have not so far
shown that this is so, but we present here a $q$-generalisation of an important manifestation of the
G-CLT, namely the de Moivre-Laplace theorem.

Model. - Let us consider the simple case of $N$ identical and distinguishable binary
random variables. These variables are not necessarily independent, and we denote by $r_{N,n}$
the associated probabilities. We have $N$ sets of probabilities with $(N + 1)$ elements each, and
$n = 0, 1, 2, \ldots, N$ as the variable index within each set. We construct these sets with a special
correlation relating the $(N + 1)$-set to the $N$-set, in such a way that the system has a particular
scale invariance. The probabilities are correlated across different system sizes, the marginal
probabilities of the $N$-system being identical to the joint probabilities of the $(N - 1)$-system.
More particularly, we impose the Leibnitz rule, soon to be defined.

(1) To be more precise, they maximise (minimise) $S_q$ whenever it is a concave (convex) function, i.e., for $q > 0$
(for $q < 0$). Let us also mention that, for $q < 0$, only states with nonzero probability enter into the calculation
of $S_q$.

The trivial case is that of independence. Consider the Pascal triangle, a number triangle
whose rows are formed by the binomial coefficients $\binom{N}{n} = \frac{N!}{(N-n)!\,n!}$. The set $\{\binom{N}{n}/2^N\}$
constitutes a probability set for any fixed $N$. In the limit of $N \to \infty$ and after appropriate
centering and rescaling, this set approaches a Gaussian distribution. As mentioned earlier,
this is known as the de Moivre-Laplace theorem. If each one of the binary variables has
probabilities $p$ and $1-p$, the elements of this triangle for fixed $N$ will be given by
$\{\binom{N}{n} p^{N-n}(1-p)^n\}$. The previous simple case (Pascal triangle) corresponds to $p = 1/2$. As was
done for one of the systems studied in [10], we now construct our probabilities by imposing the
following rule:

$$r_{N,n} + r_{N,n+1} = r_{N-1,n} \qquad (n = 0, 1, \ldots, N-1;\ N = 2, 3, \ldots). \tag{1}$$

This rule, already referred to as the "Leibnitz rule", is the one used to build the Leibnitz
Harmonic Triangle [11]. Note that every probability from row $N - 1$ is the sum of two
probabilities from row $N$. Furthermore, the Leibnitz rule ensures by construction that for any
set of $N$ variables, the sum of the probabilities (each one multiplied by the degeneracy factor
given by the appropriate binomial coefficient) will always be equal to the corresponding sum
for the previous row. This means that if the $(N - 1)$-th row sums to unity, so does the $N$-th
row. We thus verify that $\sum_{n=0}^{N} \binom{N}{n} r_{N,n} = 1$ ($r_{N,n} \in [0, 1]$; $N = 1, 2, 3, \ldots$; $n = 0, 1, \ldots, N$).
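The row-by-row construction and the conservation of the weighted row sum can be illustrated with exact rational arithmetic. This is a minimal sketch only: the helper names `leibnitz_rows` and `row_sum` are hypothetical, and the extra "row 0" equal to $[1]$ is a convenience assumed here so that Eq. (1), applied at $N = 1$, normalises the first row.

```python
from fractions import Fraction
from math import comb

def leibnitz_rows(first_column):
    """Build rows r[N][n] from the left-edge values r_{N,0} (N = 1, 2, ...).

    Uses the Leibnitz rule in the form r_{N,n+1} = r_{N-1,n} - r_{N,n}.
    Row 0 = [1] is an assumed convenience that normalises row 1.
    """
    rows = [[Fraction(1)]]
    for N, r_N0 in enumerate(first_column, start=1):
        row = [Fraction(r_N0)]
        for n in range(N):
            row.append(rows[N - 1][n] - row[n])
        rows.append(row)
    return rows

def row_sum(rows, N):
    """Sum of row N, each entry weighted by its binomial degeneracy factor."""
    return sum(comb(N, n) * rows[N][n] for n in range(N + 1))

# Example: the Leibnitz harmonic triangle, whose left edge is r_{N,0} = 1/(N+1).
rows = leibnitz_rows([Fraction(1, N + 1) for N in range(1, 9)])
print(rows[3])
print([row_sum(rows, N) for N in range(1, 9)])
```

The conservation of the row sum follows from Pascal's identity $\binom{N}{n} = \binom{N-1}{n} + \binom{N-1}{n-1}$ combined with the rule itself, which is why the check holds for any choice of left edge and not only for this example.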

Within this procedure, the knowledge of all the elements in row $(N - 1)$ and any element
of row $N$ completely determines the other $N$ elements of row $N$. Using Eq. (1) we can
analytically calculate all the probability elements of all rows and obtain

$$r_{N,n} = \sum_{i=N-n}^{N} (-1)^{i-N+n} \binom{n}{i-N+n}\, r_{i,0}, \tag{2}$$

where each $r_{N,0}$ is an arbitrary probability value.
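As a sanity check of the closed form, one can feed it the left edge of the independent case, $r_{i,0} = p^i$, and compare with the known answer $p^{N-n}(1-p)^n$. The sketch below is illustrative: the function name `r_closed_form` is hypothetical, the summation index is shifted to $k = i - N + n$, and the value $r_{0,0} = 1$ (needed when $n = N$) is an assumption consistent with normalisation.

```python
from fractions import Fraction
from math import comb

def r_closed_form(N, n, r0):
    """Alternating sum for r_{N,n} in terms of the left edge r0[i] = r_{i,0}.

    The index k = i - N + n runs from 0 to n; r0[0] = 1 is assumed.
    """
    return sum((-1) ** k * comb(n, k) * r0[N - n + k] for k in range(n + 1))

# Independent binary variables: the left edge is p^i (exact rationals).
p = Fraction(1, 3)
N_max = 10
r0 = [p ** i for i in range(N_max + 1)]

# Compare the closed form with p^(N-n) (1-p)^n for every entry up to N_max.
mismatches = [
    (N, n)
    for N in range(N_max + 1)
    for n in range(N + 1)
    if r_closed_form(N, n, r0) != p ** (N - n) * (1 - p) ** n
]
print(mismatches)
```

Using `Fraction` keeps the alternating sum exact, so any discrepancy would indicate an error in the formula rather than rounding.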

The only remaining question is how to choose the set $\{r_{N,0}\}$. In the case of probabilistic
independence we simply have $r_{N,0} = p \times p \times \cdots \times p = p^N$ ($0 < p < 1$; $N = 1, 2, 3, \ldots$) and thus
$r_{N,n} = p^{N-n}(1-p)^n$ ($n = 0, 1, 2, \ldots, N$). The generalisation we shall propose here is based on the
$q$-product [12]:

$$x \otimes_q y = [x^{1-q} + y^{1-q} - 1]^{1/(1-q)} \qquad (x, y \ge 1;\ q \le 1). \tag{3}$$

This generalised product has the following properties: (i) $x \otimes_1 y = xy$; (ii) $x \otimes_q 1 = x$;
(iii) $\ln_q(x \otimes_q y) = \ln_q x + \ln_q y$, with $\ln_q x \equiv \frac{x^{1-q} - 1}{1-q}$ ($\ln_1 x = \ln x$) being the inverse of $e_q^x$;
(iv) $\frac{1}{x \otimes_q y} = \left(\frac{1}{x}\right) \otimes_{2-q} \left(\frac{1}{y}\right)$. If the probability distribution in phase space is uniform within a
volume $W$, the entropy $S_q$ is given by $S_q = \ln_q W$. Property (iii) can then be interpreted as
$S_q(A+B) = S_q(A) + S_q(B)$, where $A$ and $B$ are subsystems that are not independent but rather
satisfy $W_{A+B} = W_A \otimes_q W_B$. This fact connects the present work with [10]. The possibility
of a correspondence between this $q$-product and a $q$-CLT has already been conjectured [9],
and some efforts along this line already exist in the literature [13].
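Properties (ii) to (iv) are easy to confirm numerically. The following sketch uses ordinary floating point and hypothetical helper names (`q_log`, `q_exp`, `q_product`); the sample values $x = 2$, $y = 3$, $q = 1/2$ are arbitrary choices made for illustration.

```python
import math

def q_log(x, q):
    """ln_q x = (x^(1-q) - 1) / (1 - q), with ln_1 x = ln x."""
    if q == 1:
        return math.log(x)
    return (x ** (1 - q) - 1) / (1 - q)

def q_exp(u, q):
    """e_q^u = [1 + (1-q) u]^(1/(1-q)), the inverse of q_log (bracket assumed positive)."""
    if q == 1:
        return math.exp(u)
    return (1 + (1 - q) * u) ** (1 / (1 - q))

def q_product(x, y, q):
    """q-product [x^(1-q) + y^(1-q) - 1]^(1/(1-q)); the ordinary product for q = 1."""
    if q == 1:
        return x * y
    return (x ** (1 - q) + y ** (1 - q) - 1) ** (1 / (1 - q))

x, y, q = 2.0, 3.0, 0.5

# (ii) 1 is the neutral element.
neutral_ok = math.isclose(q_product(x, 1.0, q), x)
# (iii) the q-logarithm turns the q-product into a sum.
additive_ok = math.isclose(q_log(q_product(x, y, q), q), q_log(x, q) + q_log(y, q))
# (iv) inverses combine through the (2 - q)-product.
inverse_ok = math.isclose(1.0 / q_product(x, y, q), q_product(1.0 / x, 1.0 / y, 2 - q))

print(neutral_ok, additive_ok, inverse_ok)
```

Property (iv) is what allows the choice made for $1/r_{N,0}$ in terms of the $q$-product to be rewritten directly for $r_{N,0}$ in terms of the $(2-q)$-product.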

Let us now proceed with our $q$-generalised de Moivre-Laplace theorem. We choose

$$(1/r_{N,0}) = (1/p) \otimes_q (1/p) \otimes_q \cdots \otimes_q (1/p), \tag{4}$$

hence

$$r_{N,0} = p \otimes_{2-q} p \otimes_{2-q} \cdots \otimes_{2-q} p = \frac{1}{[N p^{q-1} - (N - 1)]^{1/(1-q)}}. \tag{5}$$

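For $0 < p < 1$ we see that $r_{N,0} = p^N = e^{-N \ln(1/p)}$ if $q = 1$, whereas
$r_{N,0} \sim \frac{1}{[(1/p)^{1-q} - 1]^{1/(1-q)}\, N^{1/(1-q)}} \propto \frac{1}{N^{1/(1-q)}}$ ($N \to \infty$) for $q < 1$. Combining Eqs. (2) and (5), we obtain

$$r_{N,n} = \sum_{i=N-n}^{N} (-1)^{i-N+n} \binom{n}{i-N+n} \frac{1}{[i\,p^{q-1} - (i - 1)]^{1/(1-q)}}. \tag{6}$$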



Note that $(q,p) = (0, 1/2)$ reduces to the usual Leibnitz triangle (i.e., $r_{N,0} = 1/(N + 1)$) [11].
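The alternating sum that defines $r_{N,n}$ loses many digits to cancellation, so a direct evaluation needs extended precision. The sketch below uses Python's standard `decimal` module as a stand-in for an arbitrary-precision library; it is not the library cited in the text, and the helper names (`left_edge`, `row`), the 80-digit working precision, and the small size $N = 40$ are all assumptions made for illustration.

```python
from decimal import Decimal, getcontext
from math import comb

getcontext().prec = 80  # significant digits; an assumed value, ample for N = 40

def left_edge(N_max, q, p):
    """r_{i,0} = 1 / [i p^(q-1) - (i-1)]^(1/(1-q)) for i = 0..N_max (requires q < 1)."""
    exponent = 1 / (1 - q)
    return [1 / (i * p ** (q - 1) - (i - 1)) ** exponent for i in range(N_max + 1)]

def row(N, edge):
    """r_{N,n} for n = 0..N as the alternating sum over the left edge (k = i - N + n)."""
    return [
        sum((-1) ** k * comb(n, k) * edge[N - n + k] for k in range(n + 1))
        for n in range(N + 1)
    ]

q, p, N = Decimal("0.3"), Decimal("0.5"), 40
edge = left_edge(N, q, p)

# Probabilities of finding n, including the binomial degeneracy factor.
probs = [comb(N, n) * r for n, r in enumerate(row(N, edge))]
print(sum(probs))

# The special case (q, p) = (0, 1/2) should give the Leibnitz left edge 1/(i+1).
leibnitz_edge = left_edge(10, Decimal(0), Decimal("0.5"))
print(leibnitz_edge[:4])
```

Raising the working precision with $N$ is the essential design choice here: the individual terms of the sum grow combinatorially while the result stays of order one or smaller, so a fixed double-precision evaluation would fail well before the sizes of interest.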
Results. - We studied our model numerically as a function of the index $q$ for typical
values of $p$ and $N \gg 1$. To calculate the probability values $r_{N,n}$ from Eq. (6) we used
an arbitrary precision library [14] in order to overcome the effect of the alternating series
(i.e., subtraction of almost equal large numbers), whose relative error grows very rapidly with
the number of elements $N$. For example, for $N = 300$ and $N = 1000$ we used respectively
150 and 500 significant decimal digits.

For $p = 1/2$, $N \gg 1$ and $q < 1$, the probabilities $\binom{N}{n} r_{N,n}$ neatly approach (see Figs. 1 and 2)
the $q_e$-Gaussians $p(x) = A(q_e)\sqrt{\beta}\, e_{q_e}^{-\beta x^2}$, where $A(q_e)$ is determined through normalization,
and $x \equiv \frac{n - N/2}{N/2}$ is a conveniently centered and rescaled variable. The value of $q_e$ is obtained
by plotting $\ln_{q_e}[p(x)/p(0)]$ versus $x^2$ and finding the value of $q_e$ which produces the largest
linear correlation coefficient (see Fig. 1). This is repeated for typical pairs $(q,p)$. We see that
there is some asymmetry in the distribution. More precisely, the $x > 0$ and $x < 0$ branches lead
to the same $q_e$, but the corresponding slopes $\beta$ are slightly different. This asymmetry depends
on $(q,p,N)$. Our main focus being the index $q_e$, we calculate the mean of both branches, and then
we fit as illustrated in Fig. 1. In Fig. 2 we illustrate the dependence of the distributions on
size $N$.

Fig. 1 - $\ln_{-4/3}[p(x)/p(0)]$ vs $x^2$ for $(q,p) = (3/10, 1/2)$ and $N = 1000$. Two branches are observed due to
the asymmetry emerging from the fact that we have imposed the $q$-product on the "left" side of the
triangle; we could have done otherwise. The mean value of the two branches is indicated by the dashed
line. It is through this mean line that we have numerically calculated $q_e(q)$ as indicated in Fig. 3.
In order to minimise the tiny asymmetry, we have represented a variable $x$ slightly displaced with
regard to $\frac{n - N/2}{N/2}$, so that the center $x = 0$ precisely coincides with the location of the maximum of
$p(x)$. INSET: Linear-linear representation of $p(x)$.

Fig. 2 - $\ln_{-4/3}[p(x)/p(0)]$ vs $x^2$ for $(q,p) = (3/10, 1/2)$ and various system sizes $N$. INSET: $N$-dependence
of the (negative) slopes of the $\ln_{q_e}$ vs $x^2$ straight lines. We find that, for $p = 1/2$ and $N \gg 1$,
$\langle (n - \langle n \rangle)^2 \rangle \sim N^2/\beta(N) \sim a(q)N + b(q)N^2$. For $q = 1$ we find $a(1) = 1$ and $b(1) = 0$, consistent with
normal diffusion as expected. For $q < 1$ we find $a(q) > 0$ and $b(q) > 0$, thus yielding ballistic diffusion.
The linear correlation factor of the $q$-log versus $x^2$ curves ranges from 0.999968 up to near 0.999971
when $N$ increases from 50 to 1000. The very slight lack of linearity that is observed is expected to
vanish in the limit $N \to \infty$, but at the present stage this remains a numerically open question.

The $q$-dependence of $q_e$ is exhibited in Fig. 3. The numerical results are remarkably well
described by the following conjecture:

$$q_e = 2 - \frac{1}{q} \qquad (0 < q < 1). \tag{7}$$
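The procedure used to extract $q_e$ (choose the index for which $\ln_{q_e}[p(x)/p(0)]$ is most nearly linear in $x^2$) can be sketched as follows. This is a hedged illustration on synthetic data, not a reproduction of the paper's fit: the names `q_log`, `pearson_r` and `estimate_q_e`, the candidate grid, and the test distribution (an exact $q$-Gaussian with index $-4/3$ and $\beta = 2$) are assumptions, and the averaging of the two branches is omitted because the synthetic data are symmetric.

```python
import math

def q_log(y, q):
    """ln_q y = (y^(1-q) - 1) / (1 - q), with ln_1 y = ln y."""
    if abs(q - 1.0) < 1e-12:
        return math.log(y)
    return (y ** (1.0 - q) - 1.0) / (1.0 - q)

def pearson_r(u, v):
    """Linear (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def estimate_q_e(xs, ratios, candidates):
    """Return the candidate index for which ln_q[p(x)/p(0)] is most linear in x^2."""
    x2 = [x * x for x in xs]

    def score(q):
        return abs(pearson_r(x2, [q_log(r, q) for r in ratios]))

    return max(candidates, key=score)

# Synthetic data: p(x)/p(0) for an exact q-Gaussian (assumed index and width).
q_true, beta = -4.0 / 3.0, 2.0
xs = [i / 100.0 for i in range(-40, 41)]
ratios = [(1.0 - (1.0 - q_true) * beta * x * x) ** (1.0 / (1.0 - q_true)) for x in xs]

candidates = [round(-3.0 + 0.01 * k, 2) for k in range(400)]  # grid from -3.00 to 0.99
best = estimate_q_e(xs, ratios, candidates)
print(best)
```

For data produced by the model itself one would first average the $x > 0$ and $x < 0$ branches, as described in the text, and then apply the same scan; the slope of the resulting straight line gives $-\beta$.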

Relation (7) of course means that we can rewrite the formula through which we introduced the
global correlations (Eq. (5)) as follows: $r_{N,0} = 1/[N p^{(q_e - 1)/(2 - q_e)} - (N - 1)]^{(2 - q_e)/(1 - q_e)}$,
with $q_e < 1$. If we choose this way of introducing correlations, then of course only one index
is necessary within the theory, namely $q_e$, the index of the $N \to \infty$ attractor in the space of
the distributions. We also notice that relation (7) can be thought of as being the composition
of two dualities, namely the additive duality $q \to (2 - q)$ and the multiplicative one $q \to 1/q$.
These are often encountered (see, for instance, [13,15]) in the nonextensive theory (2).
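For completeness, the substitution behind the rewritten formula and the composition of the two dualities can be spelled out. The short derivation below is a consistency check added for illustration; it uses only relation (7) and the expression for $r_{N,0}$ given earlier.

```latex
\begin{align*}
q_e = 2 - \frac{1}{q} \quad &\Longrightarrow \quad q = \frac{1}{2 - q_e}, \\
q - 1 &= \frac{q_e - 1}{2 - q_e}, \qquad
\frac{1}{1 - q} = \frac{2 - q_e}{1 - q_e}, \\
q \;\xrightarrow{\;q \to 1/q\;}\; \frac{1}{q}
  \;&\xrightarrow{\;q \to 2 - q\;}\; 2 - \frac{1}{q} = q_e .
\end{align*}
```

As a concrete instance, $q = 3/10$ gives $q_e = 2 - 10/3 = -4/3$, which is the index appearing in the captions of Figs. 1 and 2.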

(2) These two dualities appear in fact quite naturally in the theory through the properties
$\ln_q(1/x) + \ln_{2-q} x = 0$ and $q \ln_q x + \ln_{1/q}(1/x^q) = 0$.

Fig. 3 - Relation between the index $q$ from the $q$-product definition, and the index $q_e$ resulting from
the numerically calculated probability distribution. The agreement with the analytical conjecture
$q_e = 2 - \frac{1}{q}$ is remarkable. INSET: Detail for the range $0 < q_e < 1$.

Finally we studied the dependence of $p(x)$ on $(q,p)$: see Fig. 4. It can be seen that
the effect of varying either $p$ or $q$ is similar, namely to modify the location and height of
the maximum of the probability distribution $p(x)$, thus yielding skewness. This asymmetry
reflects the particular family on which we have applied the $q$-product. Here we have done it
on $r_{N,0}$, i.e., on the "left" side of the triangle. We could of course do it on its "right" side, or
on any other intermediate position. This asymmetry is somewhat similar to the one which
can occur for Lévy distributions. A further study of the detailed influence of these parameters
is currently in progress.

Summary and discussion. - We numerically illustrated, by generalising the de Moivre-Laplace
theorem, the $q$-generalisation of the standard Central Limit Theorem for specially
correlated variables. The correlation is based on the $q$-product and is scale-invariant since
the Leibnitz rule has been imposed. Our main result is that, for the sum of $N$ random
variables with $N \gg 1$, the distributions are neither Gaussians nor Lévy distributions, but a
different attractor distribution which, for $p = 1/2$ (and possibly other values of $p$), is a
double-branched $q$-Gaussian. This result strongly links the possible $q$-CLT (conjectured some time ago;
see [9] and references therein) with nonextensive statistical mechanics. Indeed, the frequent
occurrence in natural and artificial systems of the associated probability distributions would
rely on this $q$-CLT, in the same way that the frequent occurrence of Gaussians relies on the
standard G-CLT. Further exploration of the $q$-CLT is in progress, addressing among other
things (i) the effects of varying $p$ and of imposing the $q$-product elsewhere than on $r_{N,0}$; (ii)
the results of extending the present procedure from $q_e < 1$ to the entire region $q_e < 3$. This
extension will presumably require, for $1 < q_e < 3$, a new formula in place of Eq. (7).

Longstanding conversations on the subject of two of us (L.G.M. and C.T.) with F. Baldovin,
E.P. Borges and S.M.D. Queiros, and useful remarks from J.D. Farmer, F. Lillo, S.
Steinberg and H. Suyari are acknowledged. We have benefitted from partial financial support 
by Pronex/MCT, Faperj and CNPq (Brazil), and SI International and AFRL (USA). M. G-M. 
was generously supported by the C.O.U.Q. Foundation and by Insight Venture Management. 
Fig. 4 - Probability distribution $p(x)$ for $N = 300$. Left: For $q = 7/10$ and typical values of $p$ (the
asymmetry becomes evident for values of $p \ne 1/2$). Right: For $p = 4/10$ and typical values of $q$.